DOCUMENT RESUME 



ED 305 401                                             TM 013 029

AUTHOR        Pilotte, William J.; Gable, Robert K.
TITLE         Using Confirmatory Factor Analysis To Study the
              Impact of Mixed Item Stems on a Computer Anxiety
              Scale.
PUB DATE      Feb 89
NOTE          28p.; Paper presented at the Annual Meeting of the
              Eastern Educational Research Association (Savannah,
              GA, February 1989).
PUB TYPE      Reports - Research/Technical (143);
              Speeches/Conference Papers (150)
EDRS PRICE    MF01/PC02 Plus Postage.
DESCRIPTORS   *Affective Measures; Computer Science Education;
              Error of Measurement; *Factor Analysis; Factor
              Structure; Goodness of Fit; High Schools; High School
              Students; *Item Analysis; *Rating Scales; Test Bias;
              Test Items
IDENTIFIERS   *Computer Anxiety Scale; Confirmatory Factor
              Analysis; Factor Invariance; LISREL Computer Program;
              Parallel Test Forms



ABSTRACT

Confirmatory factor analysis (LISREL VI) is the method best suited to
the comparison of measurement models when those models are based on a
priori assumptions. Traditionally, positive and negative item stems
were mixed on affective scales to reduce response set bias since the
item pairs were considered to be parallel. Recent studies indicate
that positive and negative item stems may form separate factors,
implying that they represent different constructs. In this study, the
differences between positive and negative item stems were assessed
using two forms of a computer anxiety scale to ascertain if the
negation of an item produces a parallel item and to compare the factor
structures and measurement errors to determine if factor invariance
can be claimed. Three forms (Forms A, B, and C) of a computer anxiety
scale were administered to a random sample of students (20 homerooms)
at a small city high school. Reverse scoring was used for all items on
Form B and for appropriate items on Form C. The results are consistent
with those of other researchers, providing more evidence that the use
of reverse scored items on an affective scale can alter students'
responses to an item. One should view results with caution when the
instrument includes mixed item stems, since the negation of an item
tends to lead to an increase in the error variance related to the
item. In general, positive and negative forms of this scale do not
meet the criteria for factor invariance or for parallel tests. (TJH)



********************************************************************* 

* Reproductions supplied by EDRS are the best that can be made 

* from the original document. 



IMPACT OF MIXED STEMS 






U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.
Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this document
do not necessarily represent official
OERI position or policy.



"PERMISSION TO REPRODUCE THIS 
MATERIAL HAS BEEN GRANTED BY 



TO THE EDUCATIONAL RESOURCES 
INFORMATION CENTER (ERIC)." 



Using Confirmatory Factor Analysis to Study
the Impact of Mixed Item Stems 
on a Computer Anxiety Scale 
William J. Pilotte 
and 

Robert K. Gable 
University of Connecticut 



Paper presented at the annual meeting of the 
Eastern Educational Research Association, 
Savannah, Georgia, February, 1989




Abstract 

Confirmatory factor analysis (LISREL VI) is the method best
suited to the comparison of measurement models when those models are
based on a priori assumptions (Hayduk, 1987; Jöreskog & Sörbom, 1986).
Traditionally, positive and negative item stems were mixed on
affective scales to reduce response set bias since the item pairs
were considered to be parallel (Gable, 1986; Fleishman & Benson,
1987; Nunnally, 1978). Recent studies indicate that positive and
negative item stems may form separate factors, implying that they are
representative of different constructs (Benson & Hocevar, 1985;
Pilotte & Gable, 1988; Schmitt & Stults, 1985; Wright & Masters,
1982). In this study, the differences between positive and negative
item stems are studied using two forms of a computer anxiety scale to
ascertain if the negation of an item produces a parallel item and to
compare the factor structures and measurement errors to determine if
factor invariance can be claimed. The results of this study are
consistent with those of other researchers. One should view results
with caution when the instrument includes mixed item stems, since the
negation of an item tends to lead to an increase in the error
variance associated with that item. In general, positive and
negative forms of this scale do not meet the criteria for factor
invariance or for parallel tests.




Using Confirmatory Factor Analysis to Study the Impact
of Mixed Item Stems on a Computer Anxiety Scale 



The purpose of this study was to investigate the changes in
factor structure and error variance associated with items transformed
to a negative stem. This section focuses on the measurement model,
group comparisons, LISREL goodness of fit indices, and the use of
positive and negative item stems.

Measurement Model. The measurement model used to explain the
covariation in a set of observed variables is important since
reliability depends on how closely the model can reproduce the
covariance matrix (Bollen, 1982; Fleishman & Benson, 1987; Hayduk,
1987; Long, 1983). Confirmatory factor analysis allows for the
testing of different measurement models based on a set of a priori
assumptions concerning the number of restrictions placed on the scale
items. The most restrictive model dictates that all items are
equally accurate indicators and that the error associated with the
individual items is not correlated. In the least restrictive model
the factor loadings for the different scale items are free to vary,
error variance is not constrained to be equal, and the correlations
between the disturbances for the observed variables are no longer
forced to zero (Fleishman & Benson, 1987; Kenny, 1979). Previous
research has shown that correlated measurement error will bias
reliability estimates and that the reliability of an instrument can
vary across subgroups (Fleishman & Benson, 1987). The congeneric
model, being the least restrictive, is probably the most realistic
model.
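
The contrast between the most and least restrictive models can be sketched by computing the covariance matrix each one implies for a single factor. This is only an illustration, not the LISREL VI estimation itself: the function name, the loadings, and the error variances below are hypothetical values chosen for the example, and the error terms are left uncorrelated for simplicity.

```python
def implied_covariance(loadings, error_variances, factor_variance=1.0):
    """Covariance matrix implied by a one-factor model with uncorrelated errors.

    Off-diagonal entries are loading_i * factor_variance * loading_j;
    each diagonal entry also includes that item's error variance.
    """
    n = len(loadings)
    sigma = [
        [loadings[i] * factor_variance * loadings[j] for j in range(n)]
        for i in range(n)
    ]
    for i in range(n):
        sigma[i][i] += error_variances[i]
    return sigma


# Hypothetical parallel items: equal loadings and equal error variances.
parallel = implied_covariance([0.8, 0.8, 0.8], [0.36, 0.36, 0.36])

# Hypothetical congeneric items: loadings and error variances free to differ.
congeneric = implied_covariance([0.9, 0.7, 0.5], [0.19, 0.51, 0.75])

for row in congeneric:
    print([round(value, 2) for value in row])
```

Under the parallel model every pair of items is forced to share the same covariance, while the congeneric model lets each pair differ, which is why it can reproduce an observed covariance matrix more closely.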

Group Comparisons. In many instances a researcher would like to
compare different groups with respect to a certain trait. These
types of comparisons "share the implicit assumption that the measure
of interest assesses a common latent construct across populations"
(Newton, Kameoka, Hoelter, & Tanaka-Matsumi, 1984, p. 100). Since
construct equivalence is a necessary condition for cross-group
comparisons, factorial invariance must be established prior to score
interpretation. Factorial invariance remains specific to the
instrument and the population under study; consequently, it must be
examined each time two or more groups are compared (Newton et al.,
1984).

Linear structural relationships (LISREL) provides the flexibility
to test different measurement models and to compare those measurement
models for improved fit, factor invariance across groups, and equal
error variance assumptions (Bollen, 1982; Fleishman & Benson, 1987;
Hayduk, 1987; Jöreskog & Sörbom, 1986; Newton et al., 1984). The
LISREL measurement model specifies how hypothetical constructs are
measured in terms of observed variables and can be used to describe
the reliabilities and validities of those observed variables
(Jöreskog & Sörbom, 1986).

Goodness of Fit. The measurement model can be examined for
goodness of fit using several different criteria. The χ² measure
indicates if the model and the set of coefficient estimates are
consistent with the covariance matrix for the observed variables
(Bollen, 1982; Hayduk, 1987; Hoelter, 1983). This measure has the
disadvantage of being sample size dependent (Hayduk, 1987; Hocevar et
al., 1987; Jöreskog & Sörbom, 1986; Kroonenberg & Lewis, 1982; Long,
1983; Marsh, 1985, 1987). Since χ² is sample size dependent, other
measures have been designed to address the concept of fit. Bentler
and Bonett (1982) have suggested a comparison between the model under
consideration and the null model, which assumes no common factors
(Hayduk, 1987; Hocevar et al., 1987; Hoelter, 1983; Kroonenberg &
Lewis, 1982; Long, 1983; Marsh, 1985, 1987; Newton et al., 1984).
The Bentler-Bonett Index,
$(\chi^2_{null} - \chi^2_{model}) / \chi^2_{null}$, scales the
chi-square between 0 and 1.0, with 1.0 indicative of perfect fit
(Bollen, 1982; Hayduk, 1987; Marsh, 1985). The Bentler-Bonett Index
for acceptable measurement models should be greater than .90 (Bollen,
1982; Kenny, personal communication, 1988). A second comparison with
the null model is the Tucker-Lewis Index,
$(\chi^2_{null}/df_{null} - \chi^2_{model}/df_{model}) / (\chi^2_{null}/df_{null} - 1)$.
As with the Bentler-Bonett Index, larger values are indicative of
better model fit. Hayduk (1987) advocates the use of competing models
in assessing the goodness of fit rather than using the traditional
null model. Carmines and McIver (1981) suggest using the ratio of χ²
to the degrees of freedom, with values between 2 and 3 being
reasonable and indicative of model fit (Hayduk, 1987; Hoelter, 1983;
Marsh, 1985). It seems that the best method to assess the fit is to
combine these criteria with the normalized residuals obtained from the
LISREL program. The normalized residuals are "standard errors for the
estimated loadings, factor correlations and uniqueness" (Kroonenberg
& Lewis, 1982, p. 69). If all of the normalized residuals are less
than 2.0, then the model appears to adequately reproduce the given
covariance matrix (Hayduk, 1987; Jöreskog & Sörbom, 1986).
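
The three chi-square based indices discussed in this section can be computed directly once the chi-square and degrees of freedom are known for both the null model and the model under test. The sketch below uses the usual textbook definitions of the Bentler-Bonett and Tucker-Lewis indices. The function names are hypothetical, and the null-model values are made-up numbers for illustration only (the paper does not report them); the model values are the Form A figures given later in the text.

```python
def bentler_bonett(chi2_null, chi2_model):
    """Proportional reduction in chi-square relative to the null model."""
    return (chi2_null - chi2_model) / chi2_null


def tucker_lewis(chi2_null, df_null, chi2_model, df_model):
    """Compare chi-square/df ratios of the null model and the tested model."""
    null_ratio = chi2_null / df_null
    model_ratio = chi2_model / df_model
    return (null_ratio - model_ratio) / (null_ratio - 1.0)


def chi2_df_ratio(chi2_model, df_model):
    """Ratio of chi-square to its degrees of freedom."""
    return chi2_model / df_model


# ASSUMPTION: illustrative null-model values, not taken from the paper.
CHI2_NULL, DF_NULL = 900.0, 36

# Form A measurement model as reported in the text.
CHI2_MODEL, DF_MODEL = 19.61, 23

bb = bentler_bonett(CHI2_NULL, CHI2_MODEL)
tl = tucker_lewis(CHI2_NULL, DF_NULL, CHI2_MODEL, DF_MODEL)
ratio = chi2_df_ratio(CHI2_MODEL, DF_MODEL)
print(round(bb, 3), round(tl, 3), round(ratio, 3))
```

Note that the Tucker-Lewis Index can exceed 1.0 whenever the tested model's chi-square is smaller than its degrees of freedom, as it is here.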

Positive and Negative Item Stems. In developing an affective
scale, researchers have traditionally been advised to include an
equal number of positive and negative item stems in order to reduce
response set bias (Benson & Hocevar, 1985; Schmitt & Stults, 1985;
Wright & Masters, 1982; Nunnally, 1982). In following the
traditional advice, the researcher must assume that the items are
parallel or at least tau-equivalent (Fleishman & Benson, 1987). For
this assumption to hold true, the positive and negative item stems
need to define the same construct for the population under study
(Benson & Hocevar, 1985). Previous studies have confirmed that
positive and negative items are not unidimensional and that a two
factor measurement model best represents the observed covariance
matrix, with the negatively worded items defining the second factor
(Benson & Hocevar, 1985; Pilotte & Gable, 1988; Schmitt & Stults,
1985). In part, this must be true since studies have shown that
wording changes can make significant differences in the factor
structure and in the item validities (Benson & Hocevar, 1985;
Bentler, Jackson & Messick, 1971; Schmitt & Stults, 1985).

High school students react differently to positive and negative
item stems and their responses are affected by the emotionality of
the words (Simpson et al., 1976). A controlled experiment involving
upper division undergraduate students indicated that using reverse
scored items resulted in more student inaccuracies that were both
practically and significantly different (Schriesheim & Hill, 1981).
Schriesheim and Hill also concluded that the negatively worded items
were less valid (i.e., they result in less accurate responses, which
impairs the validity of the results). The use of mixed item stems to
balance response set bias appears to be ineffective; however, Wright
and Masters (1982) advocate the use of "For and Against statements" to
"expose persons with unusual response tendencies" (p. 135).

In summary, in order to study the differences that exist between
positive items and their negative transformations, a measurement model
must be established, the most realistic being the congeneric, or
least restrictive, model. These differences can be detected using the
LISREL VI multiple group procedure. The model comparisons must be
made after assessing multiple goodness of fit indices.

Purpose. This study employed confirmatory factor analysis
techniques to assess the issue of model fit and to compare different
plausible models based on each model's ability to reproduce the
original covariance matrix. First, it will address the issues of fit
and improvement of fit. Second, this study will illustrate that
transforming an affective item stem from positive to negative wording
will alter the item, resulting in the emergence of a "negative"
factor for the high school population.





METHOD 

Sample 

A random sample of students attending a small city high school
was obtained using homeroom assignments. This school population is
representative of other small city schools within the state. Although
the sample size is small, it is within acceptable bounds for a
confirmatory LISREL analysis.

Instrumentation

Three parallel forms of an instrument to measure computer anxiety
were developed to study the impact of item phrasing on the validity
of a Likert-type affective scale used in a high school setting. The
three forms differed only in phrasing. The first scale was composed
of nine items that indicate computer anxiety as defined below.

     An unpleasant, emotional state marked by worry,
     apprehension and attention associated with thinking
     about, using, or being exposed to a computer.

The scale resulted from items generated by the first author and
rated for content validity by seven experts in the field of
computer education and high school students. The experts were
sent a list of statements, a short review of the literature, and
directions for rating the items. Some statements had to be
eliminated based on the raters' comments. A specific example from
the form, "Only smart people can master a computer," elicited the
additional "...and I am smart so" or "I am not smart so," which is
more indicative of the students' general academic self confidence
than of computer anxiety.



The construct validity of this form was assessed by a factor
analysis with varimax and oblimin rotations, with the number of 
factors extracted determined by Kaiser's criterion. The original 
instrument contained 10 items on a 5-point Likert scale, with 1 
assigned to strongly disagree, 5 to strongly agree and 3 to 
neutral. The factor structure from this exploratory analysis was 
indicative of a one factor structure when only 9 items were used. 

The second form was devised by negating each item from the
original form to provide a parallel instrument. Five of these
statements were negated using the word not, while the remaining
four statements were negated by changing the target word to one
opposite in meaning. Traditionally, these items should reflect
computer anxiety when reverse scored. A third form, C, consisted
of 5 items from Form A and 4 items from Form B. An example of
each type of item is presented below:

     (computer anxiety)  I feel threatened by computers.
     (nonanxious)        I do not feel threatened by computers.

     (computer anxiety)  I feel stupid around computers.
     (nonanxious)        I feel intelligent around computers.

Analyses

The three forms of the instrument were combined into packages
and distributed by the building principal to 20 different
homerooms, equally divided among grades 9 through 12. Each
student responded anonymously to one of the three forms. Reverse
scoring was employed for all items on Form B and for the
appropriate items on Form C, such that a response of 5 was
indicative of computer anxiety.
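
Reverse scoring on the 5-point scale described above maps 1 to 5, 2 to 4, and so on, so that a high score always points toward computer anxiety. A minimal sketch follows; the function names and the example response vector are hypothetical, and which Form C positions hold negated items is an assumption made only for the illustration.

```python
def reverse_score(response, low=1, high=5):
    """Flip a Likert response so that the scale direction is reversed."""
    if not low <= response <= high:
        raise ValueError(f"response {response} outside {low}..{high}")
    return low + high - response


def score_form(responses, reversed_positions):
    """Reverse only the items at the given zero-based positions."""
    return [
        reverse_score(r) if i in reversed_positions else r
        for i, r in enumerate(responses)
    ]


# Form B style: every item is negated, so every response is reversed.
form_b = score_form([1, 2, 3, 4, 5], reversed_positions={0, 1, 2, 3, 4})

# Form C style: ASSUMED positions of the negated items, for illustration.
form_c = score_form([1, 2, 3, 4, 5], reversed_positions={0, 4})

print(form_b, form_c)
```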

A measurement model was developed for Form A and Form B. A
traditional null model with no common factors and a competing
model were also developed for each form. The Bentler-Bonett and
Tucker-Lewis Indices were calculated and used in conjunction with
the chi-square to degrees of freedom ratio, chi-square statistic,
and normalized residuals in order to assess the goodness of fit.
The two models were analyzed for factor invariance using the
LISREL VI multiple groups procedure. The issue of model fit was
addressed using the chi-square statistic and normalized
residuals. The chi-square differencing technique was used to test
for a significant increase in fit between nested models.

Results

Reliability

The three forms were analyzed for the degree of internal
consistency using Cronbach's alpha, and the results are presented
in Table 1.
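
For readers unfamiliar with the statistic, Cronbach's alpha compares the sum of the item variances with the variance of the total score. The sketch below applies the standard formula to a tiny made-up response matrix; the function name and the data are hypothetical and are not the study's responses.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of respondents and columns of items.

    Responses are assumed to be reverse scored already where needed.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from three students to two items.
responses = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

When every item orders the respondents identically, the total-score variance is as large as it can be relative to the item variances and alpha reaches 1.0; inconsistent items pull it down.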



Insert Table 1 about here



Factor Structure

A confirmatory factor analysis (LISREL VI) used to determine
if the positive and negative item stems were measures of the same
construct indicated a two factor model was preferable (Pilotte &
Gable, 1988). The factor loadings and factor correlations are
included in Table 2.



Insert Table 2 about here 



Prior to testing for factor invariance, measurement models
were independently developed for Form A and Form B. The items
were worded such that agreement was indicative of computer anxiety
for Form A and lack of anxiety for Form B. The consistency of
item stems used on each form suggested that those items should
define a single factor. Since an a priori factor structure was to
be tested, confirmatory factor analysis (LISREL VI) was used to
test and refine the measurement models. The covariance matrix for
the students' responses was input in all cases and a one factor
solution specified. Psi was set equal to 1 and the program was
allowed to estimate the factor loadings and the disturbances
associated with each item. The final measurement model was
refined to allow for correlated measurement error between pairs of
statements which theoretically share some common error variance.
The inclusion of the correlated measurement error is necessary
when the omission of a common cause contributes to the measurement
error (Hayduk, 1987). The factor loadings and item error variances
for Form A are found in Table 3.






Insert Table 3 about here 



Model Fit 

In testing the measurement model the null hypothesis is: "There
is no significant difference between the input data and the model"
(Benson, 1987; Jöreskog & Sörbom, 1986; Marsh, 1985). One
indication of adequate model fit is a nonsignificant chi-square.
The final measurement model for Form A had a χ² of 19.61 (df =
23, p = .665), which seems to indicate that the model reproduces
the original covariance matrix. The normalized residuals for the
measurement model were inspected for values greater than 2.0 in
order to ascertain if any entry in the original covariance matrix
could not be accounted for by the given model. The largest
normalized residual was .63, which supports the decision to accept
this measurement model. The chi-square to degrees of freedom
ratio, Bentler-Bonett Index, and Tucker-Lewis Index also indicate
that the given measurement model has adequate fit. The results of
these tests can be found in Table 4.



Insert Table 4 about here



The measurement model for Form A was also compared to a
competing model. A previous study indicated that the two factor
model, positive item stems on one factor and negative item stems
on the second factor, fit the data better than the single factor
model (Pilotte & Gable, 1988). Consequently, the competing model
was defined by assigning the same items to the same two factors.
Failing to accept this two factor model also supports the previous
conclusion that the item stem defined the factor (for a more
complete discussion see Pilotte & Gable, 1988). The accepted
method for comparing nested models is chi-square differencing
(Benson, 1987; Bollen, 1982; Hayduk, 1987; Marsh, 1985). This
method states that $\chi^2_1 - \chi^2_2$ is distributed as
$\chi^2$ with degrees of freedom equal to $df_1 - df_2$, with
model 1 being the more restrictive model.

This competing model was not capable of reproducing the
original covariance matrix as evidenced by the 20 normalized
residuals that were greater than 2.0. In this case the chi-square
differencing test yielded a chi-square of 235.33 (df = 4), which
is highly significant. Therefore, the one factor model being
tested better explains the data.
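
The chi-square differencing test for nested models can be sketched with the standard library alone. The upper-tail probability below relies on the closed-form series that exists for integer degrees of freedom; the function names are hypothetical, and the pair of nested models in the example uses made-up fit statistics rather than the study's values.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{j < df/2} y**j / j!
        term = math.exp(-y)
        total = term
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{j < (df-1)/2} y**(j+0.5) / gamma(j+1.5)
    total = math.erfc(math.sqrt(y))
    term = math.exp(-y) * math.sqrt(y) / math.gamma(1.5)
    for j in range((df - 1) // 2):
        total += term
        term *= y / (j + 1.5)
    return total


def chi2_difference(chi2_restricted, df_restricted, chi2_free, df_free):
    """Difference test: the restricted model is nested in the freer model."""
    d_chi2 = chi2_restricted - chi2_free
    d_df = df_restricted - df_free
    if d_df <= 0:
        raise ValueError("restricted model must have more degrees of freedom")
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)


# Hypothetical nested models (illustrative values only).
d_chi2, d_df, p_value = chi2_difference(60.0, 27, 20.0, 23)
print(d_chi2, d_df, p_value)
```

A small p-value means the added restrictions significantly worsen the fit, so the less restrictive model is retained.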

The same procedures were applied to Form B to obtain and test
the measurement model. The factor loadings and error variances
are found in Table 5.



Insert Table 5 about here 



The measurement model for Form B has a chi-square statistic of
26.60 (df = 21), which is indicative of fit between the model and
the covariance matrix. All of the normalized residuals were less
than 2.0, which indicates that the model can adequately reproduce
all the cells within the original covariance matrix. The other
tests for goodness of fit (see Table 6) support the hypothesis
that this measurement model is consistent with the original data.



Insert Table 6 about here 



The chi-square difference test for Form B, with respect to a
two factor model, yielded a value of 139.33 (df = 6), which is
highly significant. This indicates that the model being tested
provides better fit. The LISREL program estimate of the 
correlation between the two factors was .SI. This high a 
correlation also supports the conclusion to use a one factor 
model.

Factor Invariance

The measurement models for Form A and Form B were tested for
factor invariance. In constructing Form B, each item from Form A
was negated in an attempt to establish parallel items. Factor
invariance is a necessary condition for parallel items (Fleishman
& Benson, 1987). In testing for invariance, both forms are
simultaneously fit to the same model, constraining some parameters
to be equal (Hayduk, 1987; Jöreskog & Sörbom, 1986; Marsh, 1985).
The comparison was made between the model that allowed factor
loadings to vary and the model with factor loadings constrained to
be equal. This type of comparison is in keeping with the
sequential rules established by Jöreskog and Sörbom (Jöreskog &
Sörbom, 1986; Kenny, personal communication, 1988; Marsh, 1985,
1987).

The least restrictive model allowed the factor loadings to
vary over groups. This model has a chi-square of 46.26 (df =
44), which is indicative of adequate fit. The normalized
residuals were all less than 2.0, indicating that this model
reproduces the original covariance matrix. A competing two factor
model yielded a chi-square of 421.85 (df = 54), which is
indicative of poor fit. The chi-square difference of 375.59 (df
= 10) indicates the one factor model gives a better fit.

The model that forced respective items on Form A and Form B to
have equal factor loadings was then tested. This model resulted
in a chi-square of 87.21 (df = 53, p = .002), indicating lack of
fit. Analysis of the normalized residuals indicated that this
model failed to reproduce 11 entries of the original covariance
matrix for Form B. Comparing this model with the previous model
resulted in a chi-square difference of 40.95 (df = 9), suggesting
that the less restrictive model is best. The goodness of fit
statistics summarized in Table 7 all indicate that the less
restrictive model shows a better fit with the original data.
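
The degrees of freedom reported for the single group and multiple group models can be checked by counting: each group contributes p(p + 1)/2 distinct variances and covariances for p items, and every freely estimated parameter uses one of them. The sketch below is bookkeeping only. The function names are hypothetical, and the split of free parameters (9 loadings, 9 error variances, plus 4 correlated error terms for Form A and 6 for Form B) is an assumption chosen because it reproduces the reported degrees of freedom; the paper does not state these counts.

```python
def n_moments(n_items):
    """Distinct variances and covariances among n_items observed variables."""
    return n_items * (n_items + 1) // 2


def model_df(items_per_group, free_params_per_group, equality_constraints=0):
    """Degrees of freedom = available moments minus free parameters.

    Each cross-group equality constraint removes one free parameter.
    """
    moments = sum(n_moments(p) for p in items_per_group)
    free = sum(free_params_per_group) - equality_constraints
    return moments - free


# ASSUMED parameter counts (see the paragraph above).
FORM_A_PARAMS = 9 + 9 + 4
FORM_B_PARAMS = 9 + 9 + 6

df_form_a = model_df([9], [FORM_A_PARAMS])
df_form_b = model_df([9], [FORM_B_PARAMS])
df_unconstrained = model_df([9, 9], [FORM_A_PARAMS, FORM_B_PARAMS])
df_invariant = model_df([9, 9], [FORM_A_PARAMS, FORM_B_PARAMS],
                        equality_constraints=9)

print(df_form_a, df_form_b, df_unconstrained, df_invariant)
```

Under these assumptions, forcing the nine loadings to be equal across forms adds exactly nine degrees of freedom, which is the difference between the invariant and unconstrained multiple group models.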



Insert Table 7 about here 




Conclusions 

This study provides more evidence that the use of reverse
scored items on an affective scale can alter the students'
response to an item. The two scales were constructed to form
parallel items. A necessary condition for parallel items is
factor invariance, which was tested by the multiple groups
procedure of LISREL VI. The analysis clearly indicated that
respective items on the two scales were best represented by
different factor loadings. This result is consistent with earlier
research (Benson & Hocevar, 1985; Pilotte & Gable, 1988).

The LISREL model provides a measure of generalized reliability 
for the model, the total coefficient of determination, which 
decreased from .99 for Form A to .37 for Form B. This seems to 
support Schriesheim and Hill's (1981) study, which indicated that
negatively phrased items tend to be less valid, partially because
of increased student inaccuracies (Benson & Hocevar, 1985;
Schriesheim & Hill, 1981). Further analysis of these data is
necessary to study the effect that the positive/negative 
transformation may have had on item reliabilities. 

The measurement models for Form A and Form B include error
variance estimates. The error variances for the negatively
phrased items, Form B, appear to be higher than for the respective
items on Form A. This result is also consistent with previous
research (Benson, 1987; Benson & Hocevar, 1985; Schriesheim &
Hill, 1981). Since the results do not support the hypothesis of
factor invariance, the inclusion of mixed item stems on an
affective instrument should be viewed with caution. A more
complete study of the high school population needs to be
undertaken to ascertain the extent to which reverse scored items
affect the item and instrument reliabilities and the factor
structures of the instrument itself.




Table 1

Internal Consistency of the Computer Anxiety Scales

Form    Item Classification    Alpha

A       Computer Anxiety       .95 (N=94)
B       Reverse Score          .87 (N=90)
C       Mixed Stems            .73 (N=87)

Note. From Pilotte, W. & Gable, R. (1988, November). The impact of
positive and negative item stems on the validity of a computer
anxiety scale. Paper presented at the annual meeting of the
Northeastern Educational Research Association, Ellenville, N.Y.




Table 2

Factor Loadings and Factor Correlations for Form C

Factor 1               Factor 2

[illegible]            .00
.39 (11.31)            .00
.90 (8.69)             .00
.00                    .69 (4.80)
.00                    .94 (5.36)
.00                    [illegible] (2.51)
[illegible] (3.42)     .00
.00                    .40 (2.70)
.29 (2.37)             .00

Note: All loadings significant using t-values from LISREL VI
program. T-values given in parentheses next to factor loadings.

Psi (Factor 1, Factor 2)    .24 (1.37)

Note. From Pilotte, W. & Gable, R. (1988, November). The impact of
positive and negative item stems on the validity of a computer
anxiety scale. Paper presented at the annual meeting of the
Northeastern Educational Research Association, Ellenville, N.Y.






Table 3

Factor Loadings and Item Error Variance: Form A

Loading                Error Variance

.94 (11.3)             .19 [illegible]
.87 (11.8)             .12 [illegible]
.97 (10.4)             [illegible] [illegible]
.94 (11.6)             .16 (5.3)
1.0 (11.5)             [illegible] (5.4)
1.0 (11.8)             .18 (5.7)
.95 (11.2)             [illegible] [illegible]
.75 [illegible]        .90 [illegible]
.57 [illegible]        [illegible] [illegible]

Note: All loadings and error variances are significant using
t-values from LISREL VI program. T-values given in parentheses
next to each value.

χ² = 19.61; p = .665; total coefficient of determination = .990



Table 4

Goodness of Fit Indices: Form A

Index             Value

χ²/df             .85
Bentler-Bonett    .33
Tucker-Lewis      1.0



Table 5

Factor Loadings and Error Variances: Form B

Loading                Error Variance

.88 [illegible]        [illegible] [illegible]
.98 (9.4)              .41 [illegible]
.99 (9.4)              [illegible] [illegible]
.94 (7.2)              .91 [illegible]
.91 (7.3)              .34 (5.9)
.51 (5.8)              [illegible] [illegible]
.26 (1.9)*             1.37 [illegible]
.72 (5.9)              .96 [illegible]
.85 (8.5)              .45 (5.8)

Note: All loadings significant using t-values from LISREL VI
program unless marked by *. T-values given in parentheses next to
values.



Table 6

Goodness of Fit Indices: Form B

Index             Value

χ²/df             1.26
Bentler-Bonett    .97
Tucker-Lewis      .99



Table 7

Goodness of Fit Indices: Multiple Groups

Invariant

χ²/df             1.6
Bentler-Bonett    .94
Tucker-Lewis      .93

Unconstrained

χ²/df             1.05
Bentler-Bonett    .97
Tucker-Lewis      1.0






References 



Benson, J. (1987). Detecting item bias in affective scales.
     Educational and Psychological Measurement, 47, 55-67.

Benson, J. & Hocevar, D. (1985). The impact of item phrasing on
     the validity of attitude scales for elementary school
     children. Journal of Educational Measurement, 22, 231-240.

Bollen, K. (1982). A confirmatory factor analysis of air quality.
     Evaluation Review, 6, 521-535.

Fleishman, J. & Benson, J. (1987). Using LISREL to evaluate
     measurement models and scale reliabilities. Educational and
     Psychological Measurement, 47, 925-939.

Gable, R. (1986). Instrument development in the affective domain.
     Boston: Kluwer-Nijhoff.

Hayduk, L. (1987). Structural equation modeling with LISREL:
     Essentials and advances. Baltimore: Johns Hopkins University
     Press.

Hocevar, D., Khattab, A., & Michael, W. (1987). Significance
     testing and efficiency in LISREL measurement models.
     Educational and Psychological Measurement, 47, 45-49.

Hoelter, J. (1983). The analysis of covariance structures.
     Sociological Methods and Research, 11, 325-344.

Jöreskog, K. & Sörbom, D. (1986). LISREL VI: Analysis of linear
     structural relationships by the method of maximum likelihood.
     Indiana: Scientific Software, Inc.

Kenny, D. A. (1979). Correlation and causality. New York: John
     Wiley.

Kroonenberg, P. & Lewis, C. (1982). Methodological issues in the
     search for a factor model: Exploration through confirmation.
     Journal of Educational Statistics, 7, 69-89.

Long, J. S. (1983). Confirmatory factor analysis: A preface to
     LISREL. Beverly Hills, CA: Sage.

Marsh, H. (1985). The structure of masculinity/femininity: An
     application of confirmatory factor analysis to higher-order
     factor structures and factorial invariance. Multivariate
     Behavioral Research, 20, 427-449.

Marsh, H. (1987). The hierarchical structure of self-concept and
     the application of hierarchical confirmatory factor analysis.
     Journal of Educational Measurement, 24, 17-40.

Newton, R., Kameoka, V., Hoelter, J., & Tanaka-Matsumi, J. (1984).
     Maximum likelihood estimation of factor structures of anxiety
     measures: A multiple group comparison. Educational and
     Psychological Measurement, 44, 179-193.

Nunnally, J. (1978). Psychometric theory (2nd ed.). New York:
     McGraw-Hill.

Pilotte, W. & Gable, R. (1988, November). The impact of positive
     and negative item stems on the validity of a computer anxiety
     scale. Paper presented at the annual meeting of the
     Northeastern Educational Research Association, Ellenville,
     N.Y.

Schmitt, N. & Stults, D. (1985). Factors defined by negatively
     keyed items: The result of careless respondents? Applied
     Psychological Measurement, 9, 367-373.

Schriesheim, C. & Hill, K. (1981). Controlling acquiescence
     response bias by item reversals: The effect on questionnaire
     validity. Educational and Psychological Measurement, 41,
     1101-1114.

Simpson, R., Rentz, R., & Shrum, J. (1976). Influence of
     instrument characteristics on student responses in attitude
     assessment. Journal of Research in Science Teaching, 13,
     275-281.

Wright, B. & Masters, G. (1982). Rating scale analysis. Chicago:
     MESA Press.



